Web Data Collection with R

Course Taster Teaser

Peter Meißner / 2016-02-29 – 2016-03-04 / ECPR WSMT

Introduction

Course Taster

Find a course taster at:

http://pmeissner.com/downloads/user2015_meissner_webscraping.pdf

Course Teaser

… back to the course teaser

THE WEB

THE PROBLEMS

phase       problems     examples
----------  -----------  --------------------------------------
download    protocols    HTTP, HTTPS, POST, GET, …
            procedures   cookies, authentication, forms, …
----------  -----------  --------------------------------------
extraction  parsing      translating HTML (XML, JSON, …) into R
            extraction   getting the relevant parts
            cleansing    cleaning up, restructuring, combining
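The extraction phase in the table above can be sketched in a few lines of R. This is a minimal illustration, not the course's own code: it assumes the xml2 package is installed, and the page content, the table id "members", and the column names are made up for the example. The download step is replaced by an inline string so the sketch runs without a network connection; in a real run the page would be fetched from a URL first.

```r
library(xml2)

# Stand-in for a downloaded page (hypothetical toy content, not real data).
page_source <- paste0(
  "<html><body>",
  "<h1>Members</h1>",
  "<table id='members'>",
  "<tr><th>Name</th><th>Seats</th></tr>",
  "<tr><td> Party A </td><td>1,024</td></tr>",
  "<tr><td>Party B</td><td> 87 </td></tr>",
  "</table>",
  "</body></html>"
)

# Parsing: translate the HTML text into an R object.
doc <- read_html(page_source)

# Extraction: use XPath to keep only the table rows that hold data cells.
rows      <- xml_find_all(doc, "//table[@id='members']//tr[td]")
name_raw  <- xml_text(xml_find_first(rows, "./td[1]"))
seats_raw <- xml_text(xml_find_first(rows, "./td[2]"))

# Cleansing: trim whitespace, drop thousands separators, fix types.
members <- data.frame(
  name  = trimws(name_raw),
  seats = as.integer(gsub("[^0-9]", "", seats_raw)),
  stringsAsFactors = FALSE
)

print(members)
```

Each step maps to one row of the extraction block in the table: read_html() does the parsing, the XPath queries do the extraction, and trimws() / gsub() / as.integer() do the cleansing.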

THE SOLUTION

Applications

MP Biographies

Bailer, Meißner, Ohmura, Selb (2013): Seiteneinsteiger im Deutschen Bundestag. Springer VS

Legislative Process

Wikipedia Page Views - IS

Policy Effects

Mass Idealpoint Estimation

Barbera (2014)

News Based War Prediction

Chadefaux (2014)

Collective Action and Organization Formation

Shaw & Hill (2014)

Electoral Rule Effects

Street et al. (2015)

Mobile Phone Meta Data

Name Distribution

Conclusion
